
Robot Wars

No, this article is not about one of those increasingly popular television shows that feature large metallic automatons bashing each other into submission with heavy, spinning, pointy things. Rather, it is a hands-on look at ways in which Webmasters can control the Search Engine spiders visiting their sites.

As is the case with all such articles, I must begin with my usual 'I am not a techno-geek, so take all of this advice with a big grain of salt, and use these techniques at your own risk' disclaimer. Having said that, this is an inside look at an often misunderstood tool: the 'robots.txt' file. This simple text document can help keep surfers from finding and directly entering your protected members area and other 'sensitive' areas of your site, and help focus attention on those parts of your site that need it and are prepared to handle it.

To make this easier to understand, consider the search results listings you've seen. Oftentimes the pages you are directed to are not a site's home page, but 'inside' pages that can easily be taken out of context, or even out of their framesets, hampering navigation and the natural 'flow' of information that the site's designer intended. Free site owners, for one example, do not really want people hitting their galleries directly, bypassing their warning pages, FPAs and other marketing tools; yet without specific instructions to the contrary, SE spiders are more than happy to provide direct links to these areas. These awkward results can be avoided, and the listings shaped, through the use of the robots.txt file.

The Robots Exclusion Protocol
The mechanics of spider manipulation are carried out through the "Robots Exclusion Protocol," which allows Webmasters to tell visiting robots which areas of the site they should, and should not, visit and index. When a spider enters a site, the first thing it does is check the root directory for the robots.txt file. If it finds this file, it will attempt to follow the instructions included in it. If it doesn't find this file, it will have its way with your site, according to the parameters of the spider's individual programming.
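To see the protocol at its simplest, consider the two extremes (these are standard forms of the protocol, not taken from any particular site). This file turns every compliant robot away from the entire site:

User-agent: *
Disallow: /

While this one, with an empty Disallow value, gives every robot free run of it:

User-agent: *
Disallow: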

It is vitally important that the robots.txt file be placed in your domain's root directory, i.e.: https://pornworks.com/robots.txt. It should not be placed in any sub-directory, such as https://pornworks.com/galleries/robots.txt, because (unlike an .htaccess file) it simply won't work there: the robot won't look for the file anywhere but the root, and won't obey it even if it finds a copy elsewhere. While I won't promise you this, that appears to mean that free-hosted sites, and other sites not on their own domain, will not be able to use this technique.

These non-domain sites do have an option, however: the robots META tag. While not universally accepted, it is now honored by most spiders, and provides an alternative for those without domain root access. Here's the code:

<META name="robots" content="index,follow">

<META name="robots" content="noindex,follow">

<META name="robots" content="index,nofollow">

<META name="robots" content="noindex,nofollow">

These four META tags illustrate the possibilities: each tells the spider whether or not to index the page it appears on, and whether or not to follow any links it finds on that page. Only one of the four should be used on a given page, placed within the document's <HEAD></HEAD> section. While some Search Engines may recognize additional parameters within these tags, the listed examples detail the most commonly accepted values.
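Placed in context, a page that you want indexed, but whose links you do not want followed, might carry a head like this (a minimal sketch; the title is purely illustrative):

<HEAD>
<TITLE>My Gallery</TITLE>
<META name="robots" content="index,nofollow">
</HEAD>

For those sites with domain root access, a simple robots.txt file is formatted thusly (but should be modified to suit your site's individual needs and directory structure):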

User-agent: *
Disallow: /cgi-bin/
Disallow: /htsdata/
Disallow: /logs/
Disallow: /admin/
Disallow: /images/
Disallow: /includes/

In the above example, all robots are instructed to follow the file's instructions, as indicated by the "User-agent: *" wildcard. More advanced files can tailor a robot's actions according to its source; for example, individual spiders can be limited to those pages that are specifically optimized for the Search Engine that sent them. A full treatment of that is well beyond the scope of this article, but perhaps the subject of a future follow-up.
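To sketch the idea, though: Google's spider identifies itself as 'Googlebot', and a compliant robot obeys the record that names it while ignoring records meant for others. The file below would let Googlebot roam everywhere except the CGI directory, while turning all other robots away entirely:

User-agent: Googlebot
Disallow: /cgi-bin/

User-agent: *
Disallow: /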

Returning to the above example, the 'Disallow:' command tells the robot not to enter or index the contents of the directories that follow it. Each listing must be on a separate line, is case-sensitive, and cannot contain blank spaces within the path. The rest of the site is then free for the robot to explore and index.
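One detail worth knowing: under the Robots Exclusion Protocol, the value following 'Disallow:' is matched as a path prefix rather than as an exact directory name. A file such as this (using a hypothetical /members path):

User-agent: *
Disallow: /members

would keep compliant robots out of /members/, /members/index.html, and even /members.html, since each of those paths begins with /members.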

I hope this brief tutorial helps you to understand how robots interact with your site, and allows you to gain a degree of control over their actions. If you have any questions or comments about these techniques, click on the link below. ~ Stephen

